Lecture 02

Statistics 422

Welcome Back

Announcements

  • Advisors and Advising
  • Data
  • Software
  • Reproducibility

Setup

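# load packages
library(tidyverse)  # attaches ggplot2, dplyr, readr, forcats, and others
library(readr)      # already attached by tidyverse, so redundant but harmless
library(viridis)    # colorblind-friendly color scales
# set figure parameters for knitr
knitr::opts_chunk$set(
  fig.align = "center", # center align figures
  fig.asp = 0.618,      # height / width; 0.618 is the reciprocal of the golden ratio
  fig.retina = 3,       # dpi multiplier for displaying HTML output on retina screens
  fig.height = 4,       # 4 inches; overridden, since knitr sets height to fig.width * fig.asp
  dpi = 200             # higher is crisper
)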

There is a textbook (optional)

Summary Points for Chapter 1

  • Color is a highly effective way to communicate information and highlight specific elements within a visualization.
  • Color > Shape
  • Large numbers of observations should be summarized
  • Combining color, shape, size, texture, and opacity can be tricky
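
A minimal sketch of the points on summarizing and opacity, assuming the diamonds data bundled with ggplot2 (not a dataset from the lecture). With tens of thousands of points, either lowering opacity or binning the observations keeps dense regions readable.

```r
library(ggplot2)

# diamonds ships with ggplot2 and has tens of thousands of rows
# Option 1: draw every point, but nearly transparent, so overlap shows as darkness
p_points <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.05)

# Option 2: summarize, counting observations in a 50 x 50 grid of bins
p_binned <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d(bins = 50) +
  scale_fill_viridis_c()   # color encodes the count in each bin

# print p_points or p_binned to draw the plot
```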

Summary Points for Chapter 1 (cont’d)

Article (optional) by Heer & Bostock

Summary Points for Chapter 1 (cont’d)

Recall storytelling

Summary Points for Chapter 1 (cont’d)

Flexibility is important too. Rules? Not really.

Back to ggplot

Geoms

One Continuous Variable
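
A sketch of one common geom for a single continuous variable, assuming the faithful dataset from base R as stand-in data. geom_histogram() bins the variable and counts the observations in each bin.

```r
library(ggplot2)

# faithful: waiting time (minutes) between Old Faithful eruptions
p_hist <- ggplot(faithful, aes(x = waiting)) +
  geom_histogram(binwidth = 5)   # 5-minute bins; bar height = count in bin

# print p_hist to draw the plot
```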

One Continuous Variable (cont’d)
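
Another option for one continuous variable, again sketched with the base R faithful data as an assumed example: geom_density() draws a smoothed estimate of the distribution instead of binned counts.

```r
library(ggplot2)

p_dens <- ggplot(faithful, aes(x = waiting)) +
  geom_density(fill = "grey80")   # kernel density estimate, filled for visibility

# print p_dens to draw the plot
```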

One Discrete Variable

Typically non-numeric, but ordered values or numeric values with only a few distinct levels also work here
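
A sketch for one discrete variable, assuming the mpg data bundled with ggplot2. geom_bar() counts the rows in each category. The second plot wraps a numeric column with few distinct values in factor() so that it is treated as discrete.

```r
library(ggplot2)

# class is a character column: one bar per vehicle class
p_bar <- ggplot(mpg, aes(x = class)) +
  geom_bar()   # bar height = number of rows in each class

# cyl is numeric but takes only a handful of values, so treat it as discrete
p_cyl <- ggplot(mpg, aes(x = factor(cyl))) +
  geom_bar()

# print p_bar or p_cyl to draw the plot
```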

Two Continuous Variables
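
A minimal sketch for two continuous variables, assuming the mpg data bundled with ggplot2: a scatterplot with a fitted line layered on top.

```r
library(ggplot2)

p_scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.5) +                                  # one point per car
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)    # straight-line fit

# print p_scatter to draw the plot
```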

One Discrete/One Continuous

Discrete & Continuous

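geom_col requires both x and y values.

The heights of the bars represent values in the data, whereas geom_bar counts the rows in each group.

The trick to ordering the bars is to use factors: bars are drawn in the order of the factor levels.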
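
A minimal sketch of both points, assuming the mpg data bundled with ggplot2 and forcats::fct_reorder() as one way to set the level order. The names class_means and mean_hwy are illustrative, not from the lecture.

```r
library(ggplot2)
library(dplyr)
library(forcats)

# One row per class, holding the value to plot
class_means <- mpg %>%
  group_by(class) %>%
  summarise(mean_hwy = mean(hwy), .groups = "drop")

# Reorder the factor levels of class by mean_hwy (ascending)
class_means <- class_means %>%
  mutate(class = fct_reorder(class, mean_hwy))

p_col <- ggplot(class_means, aes(x = class, y = mean_hwy)) +
  geom_col()   # bar height is mean_hwy itself, not a row count

# print p_col to draw the plot
```

Without the fct_reorder() step the bars would appear in alphabetical order of class.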

One Discrete/One Continuous (cont’d)
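
When each category has many observations rather than a single value, a distribution geom is one option. A sketch with geom_boxplot(), assuming the mpg data bundled with ggplot2:

```r
library(ggplot2)

p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()   # one box per class, summarizing the spread of hwy

# print p_box to draw the plot
```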

Two Continuous

Two Continuous Variables
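
When one of the two continuous variables is time, a line is a natural choice. A sketch with geom_line(), assuming the economics data bundled with ggplot2:

```r
library(ggplot2)

# economics: monthly US economic series; unemploy is in thousands of people
p_line <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()   # connects observations in order of date

# print p_line to draw the plot
```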

Two Discrete(?) Variables

Tile
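
A minimal sketch of a tile plot, assuming the mpg data bundled with ggplot2. Each tile is one combination of two discrete variables, and fill carries a third value (here a count computed with dplyr::count(); combo_counts is an illustrative name).

```r
library(ggplot2)
library(dplyr)

# One row per observed (class, drv) pair, with n = number of cars
combo_counts <- count(mpg, class, drv)

p_tile <- ggplot(combo_counts, aes(x = drv, y = class, fill = n)) +
  geom_tile() +
  scale_fill_viridis_c()   # color encodes the count

# print p_tile to draw the plot
```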

Two Discrete(?) Variables

Tile + Facet
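
A sketch of tiles split into panels, assuming the mpg data bundled with ggplot2. Faceting by year draws one tile grid per model year (combo_by_year is an illustrative name).

```r
library(ggplot2)
library(dplyr)

# Count cars for each (year, class, drv) combination
combo_by_year <- count(mpg, year, class, drv)

p_facet <- ggplot(combo_by_year, aes(x = drv, y = class, fill = n)) +
  geom_tile() +
  scale_fill_viridis_c() +
  facet_wrap(~ year)   # one panel per model year

# print p_facet to draw the plot
```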